A method of record linkage.

Authors

  • A Oshima
  • F Sakagami
  • A Hanai
  • I Fujimoto
Abstract

In cancer epidemiology, prospective approaches are very important both in testing etiological hypotheses and in evaluating preventive procedures. Prospective studies, however, are difficult and expensive, because a satisfactory study requires a large number of people and a long period of observation. A population-based cancer registry is a very useful data source for follow-up studies. The Osaka Cancer Registry has been in operation since December 1962. Since 1968 the data processing, including the work of collation, has been semicomputerized. To identify cancer patients, we use the following six indices: date of birth; first Chinese character of the person's family name; address a (city, ward, town, or village); address b (further details, i.e., street, avenue, section, hamlet, etc.); site; and sex. When we have data on the collation indices for the subjects to be followed up, we can conduct follow-up studies easily and accurately, using a semicomputerized collation method similar to that in the cancer registration system. Because the master file of the Osaka Cancer Registry contains the data of reported cancer cases and all cancer deaths among the residents of Osaka Prefecture, we can follow up subjects living in Osaka Prefecture and obtain data on virtually all cancer incidences and deaths among them. In this follow-up method by means of record linkage to the cancer registry, the following factors should be taken into account: coverage of cancer data in the Osaka Cancer Registry, reliability of the collation method, and the addresses of the subjects to be followed up. As an example of a study with this method, we present the follow-up study of the screenees of a mass screening program for stomach cancer.
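
The collation idea in the abstract can be illustrated with a minimal sketch of exact-match linkage on the collation indices. This is not the registry's actual procedure: the `Record` type, the field names, the sample data, and the decision to match follow-up subjects on the five person-level indices only (leaving out site, which is known only once a cancer is registered) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical record layout based on the six indices in the abstract."""
    birth_date: str      # date of birth, e.g. "1925-03-14"
    name_initial: str    # first Chinese character of the family name
    address_a: str       # city, ward, town or village
    address_b: str       # street, avenue, section, hamlet, etc.
    sex: str
    site: str = ""       # cancer site; empty for subjects with no diagnosis


# Assumption: subjects are matched on the person-level indices only.
PERSON_KEY = ("birth_date", "name_initial", "address_a", "address_b", "sex")


def key_of(record, fields=PERSON_KEY):
    """Build the collation key for a record from the chosen index fields."""
    return tuple(getattr(record, f).strip() for f in fields)


def link(subjects, registry, fields=PERSON_KEY):
    """Return, for each subject position, the registry entries sharing its key."""
    index = defaultdict(list)
    for entry in registry:
        index[key_of(entry, fields)].append(entry)
    return {i: index.get(key_of(s, fields), []) for i, s in enumerate(subjects)}


# Invented sample data, for illustration only.
registry = [
    Record("1925-03-14", "田", "Osaka-shi Kita-ku", "Umeda 1", "M", "stomach"),
    Record("1930-07-02", "山", "Sakai-shi", "Mikunigaoka 2", "F", "lung"),
]
subjects = [
    Record("1925-03-14", "田", "Osaka-shi Kita-ku", "Umeda 1", "M"),
    Record("1940-01-01", "林", "Suita-shi", "Senri 3", "F"),
]

matches = link(subjects, registry)
print(len(matches[0]), len(matches[1]))
```

Here the first subject shares a key with one registry entry and the second with none. Exact matching of this kind is sensitive to changes of address and recording errors, which is in line with the abstract's caution about the reliability of the collation method and the addresses of the subjects; in a semicomputerized workflow, candidate matches would presumably still be reviewed by hand.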

Similar resources

Probabilistic Linkage of Persian Record with Missing Data

Extended Abstract. When the comprehensive information about a topic is scattered among two or more data sets, using only one of those data sets would lead to information loss available in other data sets. Hence, it is necessary to integrate scattered information to a comprehensive unique data set. On the other hand, sometimes we are interested in recognition of duplications in a data set. The i...

Probabilistic record linkage and a method to calculate the positive predictive value.

BACKGROUND Computerized record linkage is commonly used in cohort studies to ascertain the study outcome, and as such its accuracy classifying the outcome can be described using the standard epidemiological terms of sensitivity and positive predictive value (PPV). METHOD We describe a 'duplicate method' to calculate the PPV of record linkage when each record can only be involved in one match ...

A Decision Tree Based Record Linkage for Recommendation Systems

Record linkage merges all the records relating to the same entity from multiple datasets, at the entity level. It is the initial data preparation phase for most of the database projects. Traditionally one to one data linkage is performed among the entities of same type with common unique identifier. The proposed one to many and/or many to many record linkage method is able to link the entities ...

Private record linkage with Bloom filters

In many record linkage applications, identifiers have to be encrypted to preserve privacy. Therefore, a method for approximate string comparison in private record linkage is needed. We describe a new method of approximate string comparison in private record linkage. The main idea is to store q-grams sets derived from identifier values in Bloom filters and compare them bitwise across databases. ...

Leveraging Social Media Signals for Record Linkage

Many data-intensive applications collect (structured) data from a variety of sources. A key task in this process is record linkage, which is the problem of determining the records from these sources that refer to the same real-world entities. Traditional approaches use the record representation of entities to accomplish this task. With the nascence of social media, entities on the Web are now a...

Privacy Preserving Probabilistic Record Linkage (P3RL): a novel method for linking existing health-related data and maintaining participant confidentiality

BACKGROUND Record linkage of existing individual health care data is an efficient way to answer important epidemiological research questions. Reuse of individual health-related data faces several problems: Either a unique personal identifier, like social security number, is not available or non-unique person identifiable information, like names, are privacy protected and cannot be accessed. A s...

Journal:
  • Environmental Health Perspectives

Volume 32  Issue 

Pages  -

Publication date 1979